148 research outputs found

Building a Generation Knowledge Source using Internet-Accessible Newswire

Author: McKeown Kathleen R.
Radev Dragomir R.
Publication date: 01/01/1997

In this paper, we describe a method for automatic creation of a knowledge source for text generation using information extraction over the Internet. We present a prototype system called PROFILE which uses a client-server architecture to extract noun-phrase descriptions of entities such as people, places, and organizations. The system serves two purposes: as an information extraction tool, it allows users to search for textual descriptions of entities; as a utility to generate functional descriptions (FD), it is used in a functional-unification based generation system. We present an evaluation of the approach and its applications to natural language generation and summarization.

Comment: 8 pages, uses eps

arXiv.org e-Print Archive

CiteSeerX

Crossref

Columbia University Academic Commons

Gathering Statistics to Aspectually Classify Sentences with a Genetic Algorithm

Author: McKeown Kathleen R.
Siegel Eric V.
Publication date: 01/01/1996

This paper presents a method for large corpus analysis to semantically classify an entire clause. In particular, we use cooccurrence statistics among similar clauses to determine the aspectual class of an input clause. The process examines linguistic features of clauses that are relevant to aspectual classification. A genetic algorithm determines what combinations of linguistic features to use for this task.

Comment: postscript, 9 pages, Proceedings of the Second International Conference on New Methods in Language Processing, Oflazer and Somers ed

arXiv.org e-Print Archive

CiteSeerX

Columbia University Academic Commons

Paraphrasing Using Given and New Information in a Question-Answer System

Author: McKeown Kathleen R
Publication venue: ScholarlyCommons
Publication date: 01/01/1979

The design and implementation of a paraphrase component for a natural language question-answer system (CO-OP) is presented. A major point made is the role of given and new information in formulating a paraphrase that differs in a meaningful way from the user's question. A description is also given of the transformational grammar used by the paraphraser to generate questions.

Crossref

ScholarlyCommons@Penn

Using the Annotated Bibliography as a Resource for Indicative Summarization

Author: Kan Min-Yen
Klavans Judith L.
McKeown Kathleen R.
Publication date: 01/01/2002

We report on a language resource consisting of 2000 annotated bibliography entries, which is being analyzed as part of our research on indicative document summarization. We show how annotated bibliographies cover certain aspects of summarization that have not been well-covered by other summary corpora, and motivate why they constitute an important form to study for information retrieval. We detail our methodology for collecting the corpus, and overview our document feature markup that we introduced to facilitate summary analysis. We present the characteristics of the corpus, methods of collection, and show its use in finding the distribution of types of information included in indicative summaries and their relative ordering within the summaries.

Comment: 8 pages, 3 figures

arXiv.org e-Print Archive

CiteSeerX

Columbia University Academic Commons

Coordinating Text and Graphics in Explanation Generation

Author: Kathleen R. McKeown
Steven K. Feiner
Publication date: 01/01/1989

To generate multimedia explanations, a system must be able to coordinate the use of different media in a single explanation. In this paper, we present an architecture that we have developed for COMET (COordinated Multimedia Explanation Testbed), a system that generates directions for equipment maintenance and repair, and we show how it addresses the coordination problem. In particular, we focus on the use of a single content planner that produces a common content description used by multiple media-specific generators, a media coordinator that makes a fine-grained division of information between media, and bidirectional interaction between media-specific generators to allow influence across media.

CiteSeerX

Crossref

Resources for Evaluation of Summarization Techniques

Author: Kan Min-Yen
Klavans Judith L.
Lee Susan
McKeown Kathleen R.
Publication date: 01/01/1998

We report on two corpora to be used in the evaluation of component systems for the tasks of (1) linear segmentation of text and (2) summary-directed sentence extraction. We present characteristics of the corpora, methods used in the collection of user judgments, and an overview of the application of the corpora to evaluating the component system. Finally, we discuss the problems and issues with construction of the test set which apply broadly to the construction of evaluation resources for language technologies.

Comment: LaTeX source, 5 pages, US Letter, uses lrec98.st

arXiv.org e-Print Archive

CiteSeerX

Prosody Modelling in Concept-to-Speech Generation: Methodological Issues

Author: Grosz B.
Kathleen R. McKeown
Shimei Pan
Silverman K.
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2000

We explore three issues for the development of concept-to-speech (CTS) systems. We identify information available in a language-generation system that has the potential to impact prosody; investigate the role played by different corpora in CTS prosody modelling; and explore different methodologies for learning how linguistic features impact prosody. Our major focus is on the comparison of two machine learning methodologies: generalized rule induction and memory-based learning. We describe this work in the context of multimedia abstract generation of intensive care (MAGIC) data, a system that produces multimedia briefings of the status of patients who have just undergone a bypass operation.

CiteSeerX

Crossref

Columbia University Academic Commons

From Text to Speech Summarization

Author: Galley Michel
Hirschberg Julia Bell
Maskey Sameer R.
McKeown Kathleen
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2005

In this paper, we present approaches used in text summarization, showing how they can be adapted for speech summarization and where they fall short. Informal style and apparent lack of structure in speech mean that the typical approaches used for text summarization must be extended for use with speech. We illustrate how features derived from speech can help determine summary content within two ongoing summarization projects at Columbia University.

CiteSeerX

Columbia University Academic Commons

Generating multimedia briefings: coordinating language and illustration

Author: Chang Shih-Fu
Dalal Mukesh
Feiner Steven K.
McKeown Kathleen R.
Publication venue: Published by Elsevier B.V.
Publication date: 31/08/1998

Communication can be more effective when several media (such as text, speech, or graphics) are integrated and coordinated to present information. This changes the nature of media-specific generation (e.g., language or graphics generation), which must take into account the multimedia context in which it occurs. This paper presents work on coordinating and integrating speech, text, static and animated three-dimensional graphics, and stored images, as part of several systems we have developed at Columbia University. A particular focus of our work has been on the generation of presentations that brief a user on information of interest.

Elsevier - Publisher Connector